Skip to content

Fix: Secure TLS for Prometheus metrics endpoint#120

Open
malingatembo wants to merge 1 commit into
mainfrom
feat/OSPR-30557_tls-metrics-endpoint
Open

Fix: Secure TLS for Prometheus metrics endpoint#120
malingatembo wants to merge 1 commit into
mainfrom
feat/OSPR-30557_tls-metrics-endpoint

Conversation

@malingatembo

@malingatembo malingatembo commented Jun 24, 2026

Copy link
Copy Markdown

Followed pattern from openstack-k8s-operators/telemetry-operator
Commit: 6111e80d by Jaromir Wysoglad (2024-02-28)
TLS implementation for Prometheus metrics in OpenShift

How adapted here

Jaromir's approach: Optional TLS via API (flexible, supports cert-manager or service-ca)
Our approach: Always-on TLS with service-ca only

Core pattern (5 components):

  1. Service annotation: triggers service-ca to create certificate
  2. Volume mounts: makes certificate accessible to Pod
  3. Metrics server config: loads certificate from volume
  4. CA bundle ConfigMap: service-ca injects CA certificate
  5. ServiceMonitor : references CA, verifies certificate

Result:
same security outcome as reference implementation
simpler
automatic certificate management

Files Changed

6 source files + 4 auto-generated bundle files (via make bundle)

On branch feat/OSPR-30557_tls-metrics-endpoint

modified:   cmd/main.go
modified:   config/default/metrics_service.yaml
modified:   config/manager/manager.yaml
new file:   config/prometheus/ca-bundle-configmap.yaml
modified:   config/prometheus/kustomization.yaml
modified:   config/prometheus/monitor.yaml

Summary by CodeRabbit

  • New Features

    • Metrics and monitoring now use a proper TLS certificate setup, improving secure connections.
    • OpenShift can automatically provide serving certificates and CA bundle injection for metrics endpoints.
  • Bug Fixes

    • Replaced insecure monitoring TLS verification with certificate-validated connections.
    • Updated deployment configuration so the metrics service can reliably access its certificate files.

@openshift-ci openshift-ci Bot requested review from lpiwowar and umago June 24, 2026 10:00
@openshift-ci

openshift-ci Bot commented Jun 24, 2026

Copy link
Copy Markdown

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: malingatembo
Once this PR has been reviewed and has the lgtm label, please assign umago for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@coderabbitai

coderabbitai Bot commented Jun 24, 2026

Copy link
Copy Markdown

Review Change Stack

📝 Walkthrough

Walkthrough

The metrics server is updated to use OpenShift-provisioned TLS certificates instead of self-signed ones. The metrics Service is annotated to trigger OpenShift certificate generation into the operator-metrics-tls secret, which is mounted into the controller-manager pod. The Prometheus ServiceMonitor switches from insecureSkipVerify to CA bundle verification, a new CA bundle ConfigMap is added, and bundle manifests are regenerated to reflect these changes.

Changes

Metrics Server TLS Hardening

Layer / File(s) Summary
OpenShift cert provisioning and volume mount
config/default/metrics_service.yaml, config/manager/manager.yaml, config/prometheus/ca-bundle-configmap.yaml, config/prometheus/kustomization.yaml
Adds the service.beta.openshift.io/serving-cert-secret-name: operator-metrics-tls annotation to the metrics Service, mounts the resulting secret into the manager container at /tmp/k8s-metrics-server/serving-certs, and introduces a serving-certs-ca-bundle ConfigMap with the OpenShift CABundle injection annotation registered in the Prometheus Kustomize overlay.
Metrics server and Prometheus TLS config
cmd/main.go, config/prometheus/monitor.yaml
Sets CertDir, CertName, and KeyName in metricsserver.Options to load certificates from the mounted path, and replaces insecureSkipVerify: true in the ServiceMonitor tlsConfig with caFile and serverName: metrics.system.svc for verified TLS.
Bundle manifests sync
bundle/manifests/openstack-lightspeed-operator-metrics_v1_service.yaml, bundle/manifests/openstack-lightspeed-operator.clusterserviceversion.yaml
Regenerates OLM bundle artifacts: adds the serving-cert annotation to the bundled metrics Service, adds the cert volumeMount and volume to the CSV deployment spec, and bumps the CSV createdAt timestamp and builder annotation to operator-sdk-v1.41.1.

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

Poem

🐇 Hop hop, the certs now flow,
From OpenShift's secret below,
No more skipping TLS checks,
CA bundles save our necks,
Metrics dance in verified glow! 🔐

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Title check ✅ Passed The title clearly matches the main change: securing the Prometheus metrics endpoint with verified TLS.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch feat/OSPR-30557_tls-metrics-endpoint

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (1)
config/prometheus/monitor.yaml (1)

18-19: 🎯 Functional Correctness | 🔵 Trivial | 🏗️ Heavy lift

Avoid hardcoding one rendered namespace into serverName.

The rest of this manifest set still uses the system placeholder namespace, but Line 19 pins hostname verification to metrics.openstack-lightspeed-system.svc. That only works for one rendered namespace; installing into any other namespace will break TLS verification against the service-ca cert. Please drive serverName from the same namespace transform/replacement path as the Service resource instead of baking in a single output.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@config/prometheus/monitor.yaml` around lines 18 - 19, The Prometheus TLS
config is hardcoding a rendered namespace in the serverName value, which will
break hostname verification when the manifest is installed into a different
namespace. Update the serverName in the Prometheus config to use the same
namespace substitution/replacement mechanism as the Service resource and the
rest of this manifest set, so it is generated dynamically from the target
namespace instead of being fixed to metrics.openstack-lightspeed-system.svc.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@cmd/main.go`:
- Around line 113-116: The metrics server setup is missing the TLS policy from
tlsOpts, so only the serving cert paths are being applied. Update the metrics
server options in main to also set TLSOpts: tlsOpts alongside CertDir, CertName,
and KeyName so the metrics endpoint inherits the same TLS 1.3 minimum and HTTP/2
disablement as the webhook server.

---

Nitpick comments:
In `@config/prometheus/monitor.yaml`:
- Around line 18-19: The Prometheus TLS config is hardcoding a rendered
namespace in the serverName value, which will break hostname verification when
the manifest is installed into a different namespace. Update the serverName in
the Prometheus config to use the same namespace substitution/replacement
mechanism as the Service resource and the rest of this manifest set, so it is
generated dynamically from the target namespace instead of being fixed to
metrics.openstack-lightspeed-system.svc.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 9a9dc39c-bf0c-4d32-8c47-657fb3555182

📥 Commits

Reviewing files that changed from the base of the PR and between d598c3f and cc98139.

📒 Files selected for processing (6)
  • cmd/main.go
  • config/default/metrics_service.yaml
  • config/manager/manager.yaml
  • config/prometheus/ca-bundle-configmap.yaml
  • config/prometheus/kustomization.yaml
  • config/prometheus/monitor.yaml

Comment thread cmd/main.go Outdated
@malingatembo malingatembo force-pushed the feat/OSPR-30557_tls-metrics-endpoint branch from de7f4f0 to 073deac Compare June 24, 2026 12:12
Remove insecureSkipVerify from ServiceMonitor and implement proper
TLS certificate verification using OpenShift service-ca.

Changes:
- Add service-ca annotation to metrics Service for automatic cert generation
- Mount certificate Secret in operator Pod
- Configure metrics server to use service-ca certificates
- Update ServiceMonitor with CA bundle and server name verification
- Create CA bundle ConfigMap for Prometheus
- Regenerate bundle manifests

Fixes: OSPR-30557

 On branch feat/OSPR-30557_tls-metrics-endpoint
	modified:   bundle/manifests/openstack-lightspeed-operator-metrics_v1_service.yaml
	modified:   bundle/manifests/openstack-lightspeed-operator.clusterserviceversion.yaml
	modified:   cmd/main.go
	modified:   config/default/metrics_service.yaml
	modified:   config/manager/manager.yaml
	new file:   config/prometheus/ca-bundle-configmap.yaml
	modified:   config/prometheus/kustomization.yaml
	modified:   config/prometheus/monitor.yaml
@malingatembo malingatembo force-pushed the feat/OSPR-30557_tls-metrics-endpoint branch from 073deac to 845c723 Compare June 24, 2026 14:16
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant